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Abstract 

We describe a variation of the iterative closest point (ICP) algorithm for aligning two point sets under a set of 
transformations. Our algorithm is superior to previous algorithms because (1) in determining the optimal align- 
, ment, it identifies and discards likely outliers in a statistically robust manner, and (2) it is guaranteed to converge 

to a locally optimal solution. To this end, we formalize a new distance measure, fractional root mean squared 
distance (frmsd), which incorporates the fraction of inliers into the distance function. We lay out a specific im- 
plementation, but our framework can easily incorporate most techniques and heuristics from modern registration 
algorithms. We experimentally validate our algorithm against previous techniques on 2 and 3 dimensional data 
exposed to a variety of outlier types. 
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1 Introduction 

Aligning an input data set to a model data set is fundamental to many important problems such as scanned model 
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reconstruction O, structural biochemistry [25|, and medical imaging |12|. The input data and the model data are 
^ typically given as a set of points. A point set may arise from laser scans of a 3D or 2D model, coordinates of atoms 
I— —I , in a protein, positions of a lesions from a medical patient, or some other sparse representation of data. However, the 

^ ' relative positions of these point sets is not known, making the task of registering them nontrivial. 
j>. A popular approach to solving this problem is known as the iterative closest point (ICP) algorithm Q |3l which 

OO alternates between finding the optimal correspondence between points, and finding the optimal transformation of one 
^ point set onto the other. As both steps reduce the distance between the point sets, this process converges, but only 
to a local minimum. The effectiveness, simplicity, and generality of this algorithm has led to many variations 1 26] 
O , l20l[T9l l4ll5l fT0lfT8ll25ll . For instance, the set of legal transformations can be just translations, all rigid motions, 
or all affine transformations. Other versions replace the optimal correspondence between points by aligning each 
data point to the closest point on an implicit surface of the model data |3|. Or the traditional squared distance can 
O be replaced with a more efficient and stable approximation to the squared distance function 1 15|. A now slightly 
outdated, but excellent survey l20l evaluates many of these techniques. 

Yet, because ICP only converges to a local minimum, there has been considerable work on expanding and stabi- 
lizing the funnel of convergence — the set of initial positions for which ICP converges to the correct local minimum. 
5^ Others have attempted to solve the global registration problem flTlfTTl . where for any initial alignment they attempt 
to find the optimal alignment between two point sets. This is often done in two steps. First find a rough global 
alignment by corresponding certain distinguishable feature points. Second refine the alignment with ICP. 
However, all of these algorithms are vulnerable to point sets with outliers. Outliers may result from: 

• measurement error, 

• spurious data that was ignored or missed in the model, 

• partial matches because the point sets represent overlapping, but not identical pieces of the same object, 

• interesting changes in the underlying object between time steps or among comparable objects. 

In short, outliers are unavoidable. Because ICP will find correspondences for all points, and then find the optimal 
transformation for the entire point set, the outliers will skew the alignment. Many heuristics have been suggested 1 5| 
Hj including only aligning points within a set threshold k26,,22J, but most of these techniques are not guaranteed 



to converge, and thus can possibly go into an infinite loop, or require an expensive ciieck to prevent this. If the 
fraction / of points which are outliers is known, then Trimmed ICP |4| can be used to find the optimal alignment of 
the most relevant fraction / of points. This algorithm is explained in detail in Section l3!T] However, this fraction 
is rarely known a priori. If an alignment is given then RANS AC type methods j9| |2l can be used to determine a 
good threshold for determining these outliers. There are also many ad hoc solutions to this problem. However, if the 
outliers are excluded from the data set in a particular alignment, then the alignment is no longer optimal, since those 
outliers which were removed influenced how the points were initially aligned. 

1.1 Our Contributions 

Our solution to these problems is to incorporate the fraction of points which are outliers into the problem statement 
and into the function being optimized. To this end, this paper makes the following contributions: 

• We formalize a new distance measure between point sets which accounts for outliers: FRMSD. This definition 
extends the standard RMSD to account for outliers (Section|2li. 

• We provide an algorithm. Fractional ICP, to optimize FRMSD (Section l3^ which we prove to converge to a 
local optimum in the correspondence, transformation, and fraction of outliers (Section|4li. 

• We give mathematical intuition for why FRMSD aligns data points which are more likely to be inliers than 
outliers (Section |5] and Section|6ll. 

• Finally, we empirically demonstrate that Fractional ICP identifies the correct alignment while simultaneously 
determining the outliers on several data sets (Section Q. 

2 Fractional RMS Distance 

Consider two point sets M G M.'^. The goal of this paper is to align an input data set D to a model data set M 
under some class of transformations, T. These may include rotations, translations, scaling, or all affine transforms. 
We assume that these point sets are quite similar and there exists a strong correspondence between most points in 
the data. There may, however, be outliers, points in either set which are not close to any point in the other set. Our 
goal is to define and minimize over a set of transformations a relevant distance between these two point sets. To aid 
in this, we define a matching function i_l : D ^ M, which unless defined otherwise or given as a parameter, simply 
matches each point of D to the closest point in M. 

Definition 2.1. /RMSD ] The root mean squared distance (or RMSDj between two point sets D,M C M'^, for a 
given matching : D ^ M is the square root of the average squared distance between matched points: 



When convenient we sometimes write RMSd(L', M), letting /i match every point in D to the closest point in M. 

Problem 2.1. [minimizing RMSD ] Given a model point set M and an input data point set D where D, M C M.'^, 
compute the transformation T €z T to minimize RMSD(T(L'), M): 
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Problem [2.1l is algorithmically difficult because as T varies, so does the optimal matching /i. Also, RMSD is quite 
susceptible to outliers because the squared distance gives a large weight to outliers. To counteract this, a specific 
fraction / G [0, 1] of points from D can be used in the alignment and in the distance measure between the point 
sets. These f\D\ points can be chosen to solve Problem [2. ll bv selecting the points which have the smallest residual 
distance r = \ \p — fJ.{p)\\. Let Dj = {p £ D \ \Df \ = [f\D\\ and RMSD{Df,M) is minimized}. But what fraction 
of points should be used? We can always make RMSd(Dj, M) = by setting / = l/\D\ and aligning any single 
point exactly to another point. So RMSD by itself is no longer a viable measure. For this reason, we propose a new 
distance measure. 

Definition 2.2. /^FRMSD ] The fractional root mean squared distance (or FRMSDj is defined as follows: 



We will empirically and mathematically justify a value of A in Section IT!?! and Section 15] Again, it is sometimes 
convenient to let frmsd(L', M, /) = frmsd(L', M, /, //) because /x can still be determined by D and M. This 
leads to a new, more relevant problem. 

Problem 2.2. [minimize FRMSD ] Given a model point set M and an input data point set D where D,M C M."^, 
compute the transformation T £ T and fraction f € [0, 1] to minimize FRMSD (r(D), M, f): 



Intuitively, the -j^ term serves to balance the RMSD term, -j^ goes to oo as / goes to 0, while the RMSD goes to 
as / goes to 0. FRMSD, unlike RMSD over any fraction of the data points, cannot equal unless some fraction 
of points align exactly. Of course, one point can always align exactly to another point in the other subset, so in the 
implementation we restrict that / > 1/|D|, although this case is degenerate and almost never happens in practice. 
Some arbitrary nonzero minimum value of / can be set as desired. 

3 Algorithms 

In this section we describe algorithms to solve Problem l2!2l 
3.1 Trimmed ICP 

The Trimmed ICP algorithm 13.11 assumes / to be given and computes a transformation T G T of a point set D to 
minimize RMSD between Df and a model point set M. When / = 1, this is the ICP algorithm | IJ. The algorithm 
iterates between computing the optimal matching /i and the optimal transform T over the f\D\ closest points. This 
algorithm has been shown |4l to converge to a local minimum of RMSd(Dj, M) over all rotations, translations, and 
matchings. 

In practice, the comparison on line 8 of Algorithm 13.11 (^j = /Xi-i), can be replaced by checking whether the 
RMSd(Z), M) value decreases by less than some threshold at each step. TrICP, however, does not completely solve 
Problem l2.2t FRMSd(D, M) is not minimized with respect to /. It has been suggested [4| to run TrICP for several 
values of /. In fact, those same authors hypothesize that the FRMSD (L>, M, /) values returned from TrICP(D, M, /) 
are convex in /, allowing them to perform a golden ratio search technique to avoid checking all values of /. This 
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Algorithm 3.1: TrICP(L>, M, /) 

1: Compute = arg min RMSd(L>, M, /xq). 

2: i^O. 

3: repeat 

4: i ^ i + l. 

5: Compute minimizing RMSd(L)j,M) such that!)/ C D and \Df\ = [/|-D|J. 
6: Compute T £T minimizing RMSd(I)/, M). D ^ T{D). 
7: Compute in: D ^ M minimizing RMSd(£', M). 

8: until {jii = Hi-i) 



hypothesis is easily shown to be false. Also this technique fails to guarantee that the solution is a local minimum 
in the space of all transformations, matchings, and fractions. The value attained by TrICP depends on the initial 
position of D and M. Thus, for the transformation T calculated by TrICP, potentially another fraction / can give a 
smaller value of rmsd(L'j, M) or of frmsd(L), M, /). 

3.2 Fractional ICP 

A simple modification of TrICP, shown in Algorithm 13. 21 will actually provide the desired local minimum. We refer 
to this algorithm as Fractional ICP or FICP. 



Algorithm 3.2: FICP(L', M) 
1: Compute /io = arg min RMSDfD, M, ^uq). 

^lo■.D^M 

2: Compute /o G [0, 1] minimizing FRMSd(L', M, /o, /xq)- 
3: i ^ 0. 
4: repeat 

5: i^i + l. 

6: Compute Df minimizing RMSd(L'/, M) such that Dj C D and \Df\ = [/|-D|J. 
7: Compute T G T minimizing RMSd(£)j, M). D ^ T{D). 
8: Compute fii : D ^ M minimizing RMSd(D, M, fii). 
9: Compute /j £ [0, 1] minimizing FRMSd(L>, M, fi,iJ,i). 
10: until {ui = and fi = fi-i) 



Again, in practice, the comparison on line 10 of Algorithm l3.2l can be replaced be checking whether the FRMSd(Z), M, /) 
value decreases by less than some threshold at each step. 

3.3 Implementation 

To implement TrICP we need 3 operations: computing the matching, computing the subset Df, and computing the 
transformation. To implement FICP we need the additional step of computing the fraction. 

3.3.1 Computing the Matching 

For each point p G we need to find its closest point m G M. Since M is fixed through the algorithm, we can 
precompute a hierarchical data structure which can quickly return the nearest neighbor. We implemented a kd- 
tree, at a one-time, initial cost of 0(|M| log \ M\). The nearest neighbor can be returned in 0(log |M|) time. This 
operation is required for each point in \D\. So the matching can be computed in 0(|-D| log \M\). This is in general 
the most time consuming step of the algorithm. 
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We could replace the fcd-tree with a d^-tree 1151 . or when appropriate use point to surface matching as in |l3l or 
II20I . but we would loose our guarantee of convergence. 

3.3.2 Computing the Subset Df 

The set Df = {p £ D \ \Df \ = [f\D\\,RMSD{Df,M) is minimized} is defined by the [/|-D|J points with the 
smallest residual distances r = \ \p — fJ,{p)\\. This observation implies the following algorithm. Compute and sort all 
residual distances and let Dj he the f\D\ points with the smallest residual distances. The runtime is bounded by the 
sorting which takes 0(|-D| log time. 

3.3.3 Computing the Transformation 

The set of allowable transformations, T, may include rotations, translations, and scalings. Or it may be as general 
as all affine transformations. When we consider rotations, translations, and scalings, Problem l2.1l is written: 



For a fixed matching, fi, this is known as the absolute orientation problem, and can be solved exactly 1 14| in O(n^) 
time. When d < 3, this can be solved in 0{n) time |24|. There are actually 4 distinct algorithms — one using 
rotation matrices and the SVD jTJl, one using rotation matrices and the eigenvalue decomposition 1^ . one using 
unit quaternions |8|, and one using dual number quaternions l24ll — ^but all are in practice approximately equivalent 
in run time |6|. We use the simplest technique 1 13 1 which reduces the solution to computing an SVD. 
When T is the set of all affine transformations, Problem [2.1l is written: 



where A is an affine transformation. This reduces to a generic least squares problem that can be solved with a matrix 
inversion. 

3.3.4 Computing the Fraction 

There are only \D\ fractions which we need to consider. Consider the sorted order of the point set D by each point's 
residual distance r = \ \p — fi{p)\\. Each prefix of this ordering represents a distinct fraction. If we maintain the value 
Ylp^Df I \P ~ P fo*" ^^c^ Df we can compute FRMSd(L', M, f) in constant time for a given fraction /. We can 
also update to a point set of size |Z)j| + 1 in constant time by adding the next point in the sorted order to Df. If 
the ith prefix yields the smallest value of FRMSD, then / is set to i/\D\. So this computation takes OdZ?!) time. 

4 Convergence of Algorithm 

In this section we show that Algorithm 13. 2l converges to a local minimum of FRMSD (i^, M, f) over the space of all 
transformations, matchings, and fractions of points used in the matching. This is a local minimum in a sense that if 
all but one of transformations, matchings, or fractions is fixed, then the value of the remaining variable cannot be 
changed to decrease the value of FRMSD (D, M, /). 



Theorem 4.1. For any two points sets D, M G M.'^, Alsorithm \3.2\ converees to a local minimum 0/ FRMSD {T{D) , Af , /, ji) 
over if, T, 11) G [0, 1] x T x {D ^ M}. 
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Proof. Algorithm 13.21 only changes the value of {f,T,fi) when computing the optimal transformation T (line 7), 
computing the optimal matching /u (line 8), or computing the optimal fraction / (line 9). None of these steps 
can increase the value of FRMSd(D, M, /j, /ij), because staying at the current value would retain the value of 
FRMSd(L), M, f, ji), but each can potentially decrease it. (When two possible values of (/, /i, T) degenerately pro- 
duce the same value of FRMSd(D, M, f, /i), we consistently choose the smaller one according to some consistent, 
but arbitrary ordering.) 

FICP terminates only when /Xj = and fi = fi-i. The optimal ti^ansformation computed at iteration i (line 7) 
is a function of the matching of the points and which points are included, which is determined by fi-i. Thus, 
the transformation will only change in iteration i + 1 if /ij or fi change from /di^i or respectively. If /i, = 
and fi = then FICP will terminate, and {fi,T, Hi) will be a local minimum. If it were not, then either f or fi 
would have changed in the last iteration, and FRMSd(Z), M, fi, f) would have decreased or stayed the same in the 
ith iteration. 

Furthermore, FICP terminates in a finite number of iterations, because there are \D\ possible values of / and 
|M| 1^1 possible values of fx, and the algorithm can never be at any of these locations twice. □ 

In practice the convergence is much faster than the upper bound of \D\ ■ |Af|l^l steps. ICP has recently Q been 
shown to require Q{\D U M\ log \ D U M\) iterations for certain adversarial inputs; however, these rarely occur 
in practice. Furthermore, Pottmann et. al. fTO*!, have shown that ICP has linear convergence when it is close to 
the optimal solution and a point-to-point matching is used. However, ICP has quadratic convergence when using 
a point-to-surface or other similar matching criterion as described in IT9l or 1181 . The lower bounds clearly hold 
for TrICP. The upper bounds, in terms of convergence rates, intuitively hold, but the reduction seems a little more 
complicated. Such a proof is outside the scope of this paper. 



5 Data Generation Model 

In order to formalize the expected mathematical properties of the FRMSD measure and the FICP algorithm, we now 
state some fairly general assumptions about the input data. All data on which FICP is used need not these exact 
properties, but we hope that these properties are general enough that whatever differences exist in the alternative 
data will not significantly affect the following analysis and the resulting conclusions. 

Since data come from a measurement process that might generate spurious measurements as well as miss valid 
ones, we do not require every data point to have a corresponding model point, or viceversa. Specifically, we assume 
that data points are generated from model points by the following abstract procedure: 

1. Generate a set Mj of model points that will have corresponding data points (the subscript / stands for "inlier"). 

2. For every model point m G M/, let 

p = T~^{m + n) 

be the corresponding data point, where T is a transformation in the set T and n is isotropic Gaussian noise 
with standard deviation a. The set of data points p corresponding to M/ is denoted as Dj. 

3. Generate a random set Dq of data outliers. 

4. Generate a random set Mq of model outliers out of a spatial Poisson process. 

We let D = Dj U Dp and M = Mj U Mq- Let pi be the fraction of data inliers relative to all data points. The 
detailed spatial statistics of data outliers are irrelevant to our analysis. The Poisson process for model outliers is a 
minimally informative prior. We let the density of this process be uj points per unit volume. 

The probability density of the squared magnitude z = ||np of the correspondence noise is a chi square density in 
d dimensions: 
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where 



is the gamma function. In particular, 



/•oo 

Jo 



T(x) = / t^-^e-^dt 



r(o) = 

r(i) = 1 

r(n) = (n — 1)! for integer n > 1 

r(l/2) = « 1.77245 

1 • 3 • 5 • • f2n — ll 
r(n + 1/2) = i ^ for integer n > 0. 

2" 

The expected number of model outliers in a region of space with volume V is equal to uV. 
Suppose now that the correct geometric transformation T G T is applied to data point p to obtain the transformed 
data point 

q = T{p) = m + n 

(see stepElin the data generation model above). 

If q and m correspond, their distance statistics are chi square. If q and m do not correspond, the situation is more 
complex: Either point (or both) could be an outlier, or they could be non-corresponding inliers. We do not know the 
distance statistics for model inliers. In the remainder of this section, we assume that the probability that a data inlier 
is nearest to a non-corresponding model inlier is negligible. Under this assumption, the probability density of the 
distance r from q to the nearest outlier, given that model outliers are from a spatial Poisson process with density uj 
points per unit volume, can be shown to be 

w;(r) = w5((i)r'^-^e-'^^('^)^'/'^ for r > 

where 



r(<i/2) 

is the surface of the unit sphere in d dimensions and r(-) is the gamma function. The function w{r) is known as the 
WeibuU density with shape parameter d (equal to the dimension of space) and scale parameter 



1 a dT{d/2) 

So far we have not specified the units of measure. Since cr is a distance and a; is a distance raised to power —d 
(density per unit volume), the parameter au^/'^ is dimensionless. As long as a and oj are properly scaled to each 
other, the analysis that follows is independent of a. 



6 The Value of A 

In this Section we justify a particular choice for the value of A used in the definition of the fractional root mean 
squared distance (FRMSD). 

As shown in Section l3.3.4l the FICP algorithm selects a fraction / of data-model matches in increasing order of 
their residual distances r = \\p — /u(p)|| between data points p and their nearest model points Because of this, 
choosing a fraction / is equivalent to choosing a maximum allowed value r* for the residual distance r. Since we 
would like the FICP algorithm to favor inliers over outliers, it makes sense to require r* to be defined in such a way 
that data points that are r* away from a model point are equally hkely to be inliers as they are to be outliers. Let us 



7 



call such a value of r* the critical distance. We then ask the following question: Is there a value ofX in the definition 
of the FRMSD/or which the value of f chosen by the FICP algorithm corresponds to the critical distance? 

To answer this question, we first express r* as a function of the model parameters (Section lOt . and determine the 
function that relates an arbitrary distance r to the corresponding fraction / (Section ld!2t . We then write an estimate 
of the FRMSD under an ergodicity assumption (Section l63l . This estimate is itself a function of /, and therefore of 
r. The FICP algorithm maximizes the FRMSD with respect to /, that is, finds a zero for the derivative of the FRMSD 
with respect to /. Setting the value of / where this zero is achieved to /(r*) yields an equation for A, whose solution 
set justifies our choice for this parameter (Section l6!4b . 

Our analysis holds for outlier densities u) that are below a certain value Wmax> which is inversely proportional to 
the standard deviation a of the noise that affects the data points. If outliers exceed this density, then matching data 
and model points based on minimum distance is too unreliable to yield good results. 



6.1 The Critical Distance 

Define r* so that a data and a model point at distance r* from each other are equally likely to correspond to each 
other as they are not to. This section derives an expression for r* as a function of the standard deviation a of 
the correspondence noise, the density u of the spatial Poisson process that generates unmatched points, and the 
dimension d of space. 

The volume of a sphere of radius rind dimensions is 

where S{d) was defined in Section|5l The volume of the shell between radii r and r + 6r is 

6Vs = Vs(r + Sr) - Vs(r) = ^ \(r + 6r)'^ - r'^] ^ S(d) r'^~^ 5r . 

d I J 

This approximation is asymptotically exact as 6r —>■ 0. 

The probability mass in the same shell for an isotropic Gaussian distribution with zero mean and standard deviation 
a is 

as (5r — > (the term 2r derives from the Jacobian of the transformation z = r^, since the density is defined for 
the square of a distance) . 

Assume that the center of the shell above is at the transformed data point q defined in Section |5] As explained in 
Section |5l if q and m correspond, their distance statistics are chi squared, and the likelihood of a particular radius r 
is 6Gs/6r. Otherwise, the distance statistics are approximately described by a spatial Poisson process with density 
Lo. Then, the critical distance is determined by the equation 

Lo 6Vs = 6Gs 

that is, 

.d-u^_ Sid) fr^d^lu^V^^ 



ujS(dy-^Sr = , I' - e~2UJ 5r 
which can be simplified to the following: 

e-U^f = cua^(2^)'^/2 . (6.1) 

The left-hand side of equation (I6.lt is strictly positive and monotonically decreasing in r and the right-hand side 
is constant, so the equation admits a solution if and only if 

<L0< 



8 



If the outliers exceed this maximum density Wmax. the critical distance shrinks to zero: any model point around any 
given data point q is more likely to be an outlier than it is to be the model point corresponding to q. Of course, when 
there are no model outliers (uj = 0) the concept of critical distance loses its significance. 

Equation (I6.1I) can be solved for r to yield the desired value of r* as a function of the model parameters: 

= V-2 log,((^/2^a)^^) = J2 log, . 

The critical distance normalized by a and expressed as a function of a = w/ u^max is 

p{a) = = logg a . 

a 

This function is independent of all model parameters and is plotted in Figure^ 




0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 



Figure 1 : Critical distance normahzed by noise standard deviation, plotted versus model outlier density normalized 
by maximum density. 

6.2 Relationship between / and r 

As explained earlier, to every fraction / of data points considered by the FICP algorithm there corresponds a max- 
imum distance r, in the sense that f\D\ data-model point pairs have distance at most r. Consider a particular data 
point p and its transformed version q = T{p). If the data generation process is ergodic, the fraction / equals the 
probability that the nearest model point m to a point q selected at random from the transformed data set T{D) is at 
most r away. 

With probability pj, the data point q has a corresponding model point (inlier). In this event, if rj is the distance 
from this model point and ro is the distance from the nearest model outlier point, the complement of the cumulative 
probability function of the distance r to the nearest model point (either inlier or outlier) is 

1 — F{r) = 1 — P [mill (r/, ro) < r] = V[mm(rj,ro) > r] 
= V[rj > r n ro > t] = V[ri > r] V[ro > r] 
= (1 - V[ri < r]) (1 - V[ro < r]) = (1 - F^r)) (1 - Fo(r)) 

where Fi{r) and Fo{r) are respectively the probability that the matching model point and the nearest model outlier 
are at most r units away from q. From Section|5j these probabilities are as follows: 

2 

Fl{r)= r <7x2(d)(C)dC 
Jo 
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and 

cr 



Fo{r) = / w{p)dp . 
Jo 

Then, if q has a corresponding model point, the density of its distance from the nearest model point is 



, , , dF(r) d , 

= 2rg^2,j,(r^)(l-Fo(r)) + (1 - -F;(r)) u.(r) . 

With probability po = 1 — pi, the data point q is instead an outlier Then, it has no corresponding model point, 
so the probability that the nearest model point is at most r units away is simply Fo{r). In summary, the probability 
density of the distance between a data point q and its nearest model point fi{q) is 

(f)(r) = PI (pdr) + PO w{r) 

and the average fraction of model points within r units from a data point is 

/•r pr 

f{r)= / (j){p)dp=pi / (j)cip)dp + PO Fo{r) . 
Jo Jo 

The derivative of / with respect to r is 4>{r). 

6.3 Ergodic Estimate of the frmsd 

An estimate of the fractional root mean squared distance (FRMSD) can be obtained by assuming ergodically that the 
sample moment included in the definition of FRMSD is close to the corresponding statistical moment: 

^ \\p-f^{p)f^B,eDM\p-f^{p)f]. 

This assumption requires both ergodicity and a sufficient number f\D\ of data points that are close enough to the 
model points. We can then write 

■' ■' ' ' peDf 

1 \ r 

^ -j2x'^P(^Df[\\p- P-{P)\\^] = J2X p'^Hp)dp. 

6.4 Stationary Point of the frmsd Estimate 

At the minimum of FRMSd(£', M, /), the derivative of FRMSD^(Z), M, f) with respect to / is zero. Differentiation 
of the expression at the end of S ection l631 yields 

d — 2A r r"^ dr 

-FRMSd\D,MJ) = p'Hp)dp + px^ir)^- 

Since 

df) dr ^ ' 

the last addend simplifies to r^//^^, and 

d 2A 
/2^-FRMSd2(AM,/) = -- p'Hp)dp + r^. 
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Zeroing this derivative and setting r = r* and / = f{r*) yields the following equation for A: 



{r*rio' Hp)dp 

2 If Hp) dp 

Figure |2l plots the values of A in two and three dimensions as a function of the relative model outlier density 
cj/wmax and for various values of the data inlier fraction pj. 



2.2 r 
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Figure 2: Theoretical value of A in the definition of the FRMSD in two (upper bundle) and three (lower bundle) 
dimensions as a function of the relative model outlier density w/wmax and for various values of the data inlier 
fraction pj. Curves in each bundle correspond to p/ = {0.5, 0.6, 0.7, 0.8, 0.9} from the bottom up. Dependency on 
Pj is weak. 

Since the noise standard deviation a acts merely as an overall scale factor, these plots do not depend on a. It is 
apparent from the figure that A depends weakly on the fraction pj of data inliers. The knees of the plots are at about 
A = 1.3 and A = 0.95 for d = 2 and d = 3 dimensions, respectively, corresponding to cj/wmax = 0.2. These knee 
values are selected as general-purpose values for the definition of FRMSD in two and three dimensions. 



7 Experiments 

The main advantage of FICP over other variants of TCP is that it automatically determines the outlier set via a fraction 
/ and reaches a optimum in terms of the correspondence, the transformation, and the fraction of outliers. In doing 
so, it takes less time than algorithms which have no guarantees, despite searching a larger parameter space. We also 
demonstrate that the radius of convergence for FICP is expanded as compared to TrICP. 

Finally, we deal empirically with the issue of the parameter A used in the definition of FRMSD. We observe that 
FRMSD is robust to the choice of A within a broad range. However the radius of convergence and efficiency of 
FICP is improved when A is set to a slightly higher values than those determined optimal for identifying outliers in 
Section |6l Intuitively, a smaller value of A is more likely to classify correct correspondences as outliers when the 
alignment is not close, and thus get stuck in local minimum. For higher values of A these types of local minimum 
seem less prevalent. So for all performance studies we set A = 3, unless otherwise specified. For this value FICP 
has an expanded radius of convergence and tends to find very similar alignments as when A is set according to the 
analysis in Section |6l After converging, we recommend setting A = 1.3 for d = 2 or A = .95 for d = 3 to identify 
outliers more agressively. This final phase should take very few additional iterations of the algorithm, since, as we 
demonstrate, moderately modifying the value of A has small effects on the frmsd and / values returned. 
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7.1 Data Sets 



We perform many tests on the SQUID fish contour database fTTI from the University of Surrey, UK. This database 
has 1100 2D contours of fish and each contour has 500 to 3000 points. The size of this data set allows us to average 
results over a very large set of experiments. We do not know of any 3D database even close to this size, and it has 
been previously used to evaluate TrICP ||4J. 

We also perform some experiments on a limited number of 3D models. In particular we use the bunny and the 
happy Buddha data set from the Stanford 3D Scanning Repository. 

We synthetically introduce outliers into the data sets in 3 ways. We always begin by creating two copies M and 
D, to represent the model and the input data, of the particular data set. A parameter pj fraction of the final set D are 
left undisturbed as data inliers. 

• Occlusion: We randomly choose a ball B and remove all of the points from M within B. This test represents 
cases where the model set is only partially observed because of occlusions, where there are two overlapping 
views of the same object that do not exactly align, or where the input data D has grown since the model was 
formed. An example is shown in Figure |3l 

• Deformation: We randomly choose a ball B and shift randomly the points D n B. This represents the case 
where D is deformed slightly between time steps. See Figure|4] 

• New data: We add a set of points to D. These points are placed uniformly at random within a bounding box 
of D. This represents outliers caused by some sort of data retrieval noise or from spurious or new data. See 
Figure |5] 

Finally, we always introduce some further noise in the models. For each point p € D, we create a random vector n 
distributed according to a Gaussian distribution with standard deviation a, and we add n to p. 




Figure 3: SQUID example with M in blue suffering from Occlusion noise (left), and D in red (right), pi = .75 

We perform many tests on synthetic data because we then know that a good match exists and it is thus easy to 
quantify the performance on our algorithm. 

Additionally, we perform tests on real scanned data. We align pairs of scanned images of the dragon model from 
the Stanford 3D Scanning Repository from views 24° or 48° apart. Because the different views observe different 
portions of the model, there are many points which have no good alignment in both the model and data set. These 
are outliers. 

7.2 Performance 

For each synthetic data set and type of outliers described above, we perform the following set of tests. We first rotate 
D by 9 degrees where 9 is from the set {5°, 10°, 25°, 50°}. The axis of rotation is chosen randomly for the 3D 
data. We then run ICP, TrICP searching for / with the golden rectangle search, and FICP, minimizing over all rigid 
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Figure 4: SQUID example with M in blue superimposed on D in red with Deformation noise added, pi = .75 

motions. We report the total number of iterations of each, the run time, and the final values of RMSD, FRMSD, and 
/. We vary the input so that pj is either {.75, .88, .95}. We expect that optimally / should be near pj since in our 
data a is small compared to u/tJmax- All experiments were performed on a 3 GHz Pentium IV processor with 1 Gb 
SD-RAM. 

We show in Tabled the average performance of all algorithms on the entire SQUID data set where points are 
removed from M, giving D occlusion outliers with pi = {.75, .88, .95}. Table |2l shows the same where D is 
given deformation outliers with pi = {.75, .88, .95}. Table |3l shows where D is given new data outliers with 
pi = {.75, .88, .95}. TrICP and FICP return similar values of RMSD and FRMSD on average while also determining 
reasonable values for /. However, FICP is about 6x faster than TrICP using the golden ratio search. 



Algorithm 


PI 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


ICP 


.75 


0.335 


24.5 


9.454 


9.454 


1.000 


TrICP 


.75 


1.356 


117.9 


0.217 


0.541 


0.744 


FICP 


.75 


0.178 


13.6 


0.178 


0.424 


0.749 


ICP 


.88 


0.21 


21.4 


4.079 


4.079 


1.000 


TrICP 


.88 


1.032 


107.5 


0.218 


0.364 


0.784 


FICP 


.88 


0.136 


12.3 


0.175 


0.258 


0.878 


ICP 


.95 


0.137 


15.9 


1.338 


1.338 


1.000 


TrICP 


.95 


0.913 


102.4 


0.197 


0.261 


0.904 


FICP 


.95 


0.123 


12.0 


0.175 


0.205 


0.949 



Table 1 : SQUID data with Occlusion outliers, rotated 5° 

The / values when deformation outliers are introduced are noticeably larger than pi because some of the shifted 
points happen to lie very near model points when the two data sets are properly aligned. These points might as well 
be inliers. This phenomenon is less common for the other types of synthetically generated outliers. 

We also ran the same experiments with the same algorithms on the bunny (35,947 points) and happy Buddha 
(144,647 points) data from the Stanford 3D scanning repository. We report the results on the bunny data set in Table 
|4]and for the happy Buddha data set in Table|5lwhere deformation outliers are applied to D and then D is randomly 
rotated by 5°. The numbers are the the results of averages over 10 random rotations. 

Observe in Figure |6l how in the alignment of the bunny data set, the non-deformed points (red points on back 
side, blue points are not visible because they lie exactly behind the red points) are aligned almost exactly by the 
FICP algorithm while the deformed points (shifted from visible blue points in front) are ignored. Such an align- 
ment allows one to easily identify the portion of the data which has been deformed, and by how much it has been 
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Figure 5: SQUID example with M in blue (top), and D with New Data noise added in red (bottom), pi = 



.75 



deformed. Without a proper registration to the model the unaligned points have no point of comparison to gauge 
their deformation. The alignment is skewed when TCP is used and it is not helpful in determining which points are 
deformed. 



7.3 Funnel of Convergence 

We calculate the percentage of cases from the SQUID data set that converge to an FRMSD value within .01 and / 
value within .01 of the alignment between the same sets with no initial rotation. Table|6lshows the results when New 
Data outliers with pi = .88 are added to the data set D. The results for the other types of noise are simlar. For 3D 
data sets we chose a proportionally smaller, so these convergence rates are all slightly larger. Note that FICP with 
A = 3 performs much better than when A = 1.3. 

ICP has a larger radius of convergence than FICP, because it searches a much smaller parameter space. FICP has 
a larger radius of convergence than TrICP even though they search the same parameter space. 

7.4 Validating A 

We empirically justify that FRMSD is not sensitive to the choice of A. We run FICP with A set to {1, 1.3, 2, 3, 4, 5}. 
We plot the averaged results on the SQUID data set when Occlusion noise is added with pi = .75 and D is initially 
rotated 0° and 5° in Tableland Table|8j respectively. Altering A does not dramatically affect the converged solution, 
but can affect the radius of convergence. The output is similar for different types of noise. On 3D data, FICP performs 
shghtly better than 2D data for smaller A. 
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Algorithm 


Pi 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


ICP 


.75 


0.263 


28.9 


1.074 


1.074 


1.000 


TrICP 


.75 


1.103 


114.8 


0.213 


0.404 


0.803 


FICP 


.75 


0.191 


18.1 


0.231 


0.402 


0.810 


ICP 


.88 


0.215 


24.4 


0.829 


0.829 


1.000 


TrICP 


.88 


1.065 


112.8 


0.213 


0.335 


0.827 


FICP 


.88 


0.148 


14.2 


0.178 


0.241 


0.903 


ICP 


.95 


0.168 


19.6 


0.569 


0.569 


1.000 


TrICP 


.95 


1.020 


111.6 


0.203 


0.281 


0.900 


FICP 


.95 


0.138 


13.3 


0.174 


0.197 


0.959 



Table 2: SQUID data with Deformation outliers, rotated 5° 



Algorithm 


PI 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


ICP 


.75 


0.461 


26.7 


5.820 


5.820 


1.000 


TrICP 


.75 


1.578 


92.9 


0.176 


0.399 


0.768 


FICP 


.75 


0.264 


13.7 


0.175 


0.388 


0.766 


ICP 


.88 


0.286 


23.7 


4.061 


4.061 


1.000 


TrICP 


.88 


1.351 


108.0 


0.202 


0.309 


0.831 


FICP 


.88 


0.183 


13.1 


0.172 


0.246 


0.888 


ICP 


.95 


0.192 


19.5 


2.626 


2.626 


1.000 


TrICP 


.95 


1.135 


108.3 


0.205 


0.295 


0.893 


FICP 


.95 


0.148 


12.6 


0.171 


0.197 


0.953 



Table 3: SQUID data with New Data outliers, rotated 5' 



Algorithm 


PI 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


ICP 


.75 


60.1 


78.8 


0.66682 


0.66682 


1.000 


TrICP 


.75 


136.5 


172.2 


0.00523 


0.01239 


0.750 


FICP 


.75 


16.5 


17.3 


0.00522 


0.01237 


0.750 


ICP 


.88 


29.6 


48.0 


0.45303 


0.45303 


1.000 


TrICP 


.88 


147 A 


224.3 


0.00522 


0.00767 


0.880 


FICP 


.88 


13.7 


15.9 


0.00522 


0.00767 


0.880 


ICP 


.95 


13.8 


31.3 


0.37207 


0.37207 


1.000 


TrICP 


.95 


77.6 


162.8 


0.00523 


0.00610 


0.950 


FICP 


.95 


8.0 


14.2 


0.00523 


0.00610 


0.950 



Table 4: bunny with Deformation outliers, rotated 5° 
7.5 Aligning Scanned Model Data 

Finally, we perform experiments aligning real scanned range maps from 3D models. We consider aligning two scans 
from the Stanford 3D scanning repository of the dragon model. We take scans from the dragonStandRight data set 
and we align consecutive scans (24° apart), as seen in Table|9l and next-to-consecutive scans (48° apart), as seen in 
Table fTOl We first rotate the later scan by 24° or 48° to bring the scans into the approximately correct alignment. We 
then align them with ICP TrICP and FICP 

For most alignments both FICP and TrICP realize an alignment with a much lower FRMSD value than ICP. And 
occasionally, FICP noticeably outperforms TrICP in this regard as well. FICP is usually about as fast as ICP, and is 
consistently about 5 to 10 times faster than TrICP. Notice how as the solution for FICP has / approach 1, then FICP 
gracefully approaches the result of ICP with very little noticeable overhead. 
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Algorithm 


Pi 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


ICP 


.75 


430.8 


66.6 


0.56145 


0.56146 


1.000 


TrICP 


.75 


727.3 


101.0 


0.00123 


0.00291 


0.750 


FICP 


.75 


139.7 


15.6 


0.00119 


0.00282 


0.750 


ICP 


.88 


109.2 


28.2 


0.29745 


0.29745 


1.000 


TrICP 


.88 


485.4 


120.5 


0.00120 


0.00177 


0.880 


FICP 


.88 


81.4 


15.7 


0.00119 


0.00174 


0.880 


ICP 


.95 


126.1 


45.4 


0.29351 


0.29351 


1.000 


TrICP 


.95 


405.2 


123.7 


0.00120 


0.00141 


0.950 


FICP 


.95 


66.3 


14.6 


0.00119 


0.00139 


0.950 



Table 5: Buddha v/ith Deformation outliers, rotated 5 



Algorithm 


A 


5° 


10° 


25° 


50° 


ICP 




0.999 


0.997 


0.994 


0.962 


TrICP 


3 


0.875 


0.870 


0.853 


0.816 


FICP 


3 


0.952 


0.945 


0.909 


0.875 


FICP 


1.3 


0.857 


0.473 


0.141 


0.060 



Table 6: Percentage of SQUID data sets converging per initial rotation. 



Algorithm 


A 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


FICP 


1 


0.142 


10.38 


0.158 


0.225 


0.701 


FICP 


1.3 


0.069 


3.81 


0.170 


0.248 


0.749 


FICP 


2 


0.059 


3.06 


0.170 


0.303 


0.750 


FICP 


3 


0.061 


3.17 


0.170 


0.404 


0.750 


FICP 


4 


0.062 


3.21 


0.171 


0.538 


0.751 


FICP 


5 


0.063 


3.30 


0.172 


0.717 


0.751 



Table 7: FICP for different values of A with D rotated 0°. 



Algorithm 


A 


time (s) 


# iterations 


RMSD 


FRMSD 


f 


FICP 


1 


0.733 


37.23 


0.298 


1.503 


0.274 


FICP 


1.3 


0.488 


36.44 


0.219 


0.563 


0.660 


FICP 


2 


0.244 


17.00 


0.176 


0.329 


0.740 


FICP 


3 


0.198 


13.59 


0.178 


0.424 


0.749 


FICP 


4 


0.194 


13.28 


0.184 


0.570 


0.751 


FICP 


5 


0.200 


13.66 


0.299 


0.875 


0.756 



Table 8: FICP for different values of A with D rotated 5°. 

Figure shows the alignment of the scan at 0° aligned with the scan at 48° using ICP and FICP. Notice how 
when the scans are aligned with ICP, the points in the dragon's tail are slightly misaligned, whereas with FICP, the 
alignment is much better. This is confirmed in Table ITOl 

8 Conclusion 

In considering the common problem of aligning two points sets under a set of transformations, we specifically handle 
the problem of outliers. We formalize the distance measure FRMSD (a generalization of RMSD), and we provide an 
algorithm, FICP, to efficiently solve for a local minimum in this distance under a set of transformations, aU possible 
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Figure 6: Stanford bunny with M in blue (top left) and D in red (top right) with Deformation noise with pj = .75. 
Registered using FICP (bottom left) and TCP (bottom right). 

matchings, and the set of outliers. We prove that FICP converges to a local minimum, and that under reasonable 
assumptions on the data, this minimum chooses a set of inliers such that each point selected is more likely to be 
an inlier than an outlier, and each point not selected is more likely to be an outlier than an inlier. On a variety 
of synthetic data and real scanned range maps we show that FICP compares favorably to alternative algorithms 
which are guaranteed to converge — ICP and TrICP. Because this algorithm is a very simple modification of the quite 
popular ICP algorithm and it is compatible with most other recent improvements, we expect that these ideas will be 
integrated into many modern systems. 
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